DSL trellis encoding

ABSTRACT

A method is used that substantially simultaneously trellis encodes data to be modulated onto multiple tones. The embodiments of the present invention comprise the steps of: (a) using a first input operand comprising state bits for a first trellis stage; (b) using a second input operand comprising a plurality of input data bits; and (c) generating an output comprising output data bits and output state bits from a first or later trellis stage.

RELATED APPLICATIONS

This application is a continuation of application Ser. No. 10/949,517, filed Sep. 27, 2004, now U.S. Pat. No. 7,305,608, which claims the benefit of U.S. provisional application No. 60/505,720, filed on Sep. 25, 2003, both of which are incorporated by reference herein in their entireties.

FIELD OF THE INVENTION

The present invention relates generally to Digital Subscriber Line (“DSL”) systems, trellis encoding, and the design of instructions for processors. More specifically, the present invention relates to a system, method and processor instruction for DSL trellis encoding.

BACKGROUND OF THE INVENTION

Trellis encoding is a way of encoding data using a convolutional code prior to modulation such that the original data can be recovered at the receiver, even in the presence of a certain amount of noise on the received signal.

In national and international standards for DSL (digital subscriber line) technologies such as ADSL (e.g., ITU-T Recommendation G.992.1 entitled “Asymmetrical digital subscriber line (ADSL) transceivers,” ITU-T Recommendation G.992.3 entitled “Asymmetric digital subscriber line transceivers-2 (ADSL2),” and ITU-T Recommendation G.992.4 entitled “Splitterless asymmetric digital subscriber line transceivers 2 (splitterless ADSL2),” which are all incorporated by reference herein in their entireties), a particular form of trellis encoding is used for mapping a set of input data bits U={u₁, u₂, . . . , u_(z)} and input state bits S={s₀, s₁, s₂, s₃} onto two sets of output data bits V={v₀, v₁, . . . , v_(x−1)}, W={w₀, w₁, . . . , w_(y−1)} and output state bits S′={s′₀, s′₁, s′₂, s′₃}. V and W are subsequently encoded using QAM (quadrature amplitude modulation) onto a pair of tones in a DMT (discrete multi-tone) scheme, the two tones being encoded with x-bit and y-bit QAM constellations respectively. (Note that x+y=z+1; in other words, one more bit is produced in the V and W output data bits than was taken in as input data bits U.) The process is then repeated with S′ forming the input state for the trellis encoding of the next set of input data bits U′ for the next tone-pair, yielding output data bits V′ and W′, and output state bits S″, and so on.

According to the applicable standards, the equations governing the output are as follows:

v₀ = u₃
v₁ = u₁ ⊕ u₃
v_(n) = u_(n+2), for n=2 to (x−1)
w₀ = u₂ ⊕ u₃
w₁ = s₀ ⊕ u₁ ⊕ u₂ ⊕ u₃
w_(n) = u_(n+x), for n=2 to (y−1)
s′₀ = s₁ ⊕ s₃ ⊕ u₁
s′₁ = s₂ ⊕ u₂
s′₂ = s₀
s′₃ = s₁

The symbol ⊕ represents the logical exclusive-OR operation.
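By way of illustration only, the equations above can be sketched in Python for a single tone-pair. This is a minimal sketch and not text from the standards: the function name `trellis_stage`, the list-based bit representation, and the restriction to x ≥ 2 and y ≥ 2 (so that v₀, v₁, w₀ and w₁ all exist) are assumptions made for the example.

```python
def trellis_stage(u, s, x, y):
    """Encode one tone-pair using the equations above.

    u: list of z = x + y - 1 input bits, with u[0] holding u1.
    s: list of four state bits [s0, s1, s2, s3].
    Returns (v, w, s_next) as lists of bits, index 0 first.
    """
    if x < 2 or y < 2 or len(u) != x + y - 1:
        raise ValueError("expected x, y >= 2 and z = x + y - 1 input bits")
    U = [None] + list(u)  # 1-based access, so U[k] is u_k
    s0, s1, s2, s3 = s
    # v0 = u3, v1 = u1 XOR u3, v_n = u_(n+2) for n = 2 .. x-1
    v = [U[3], U[1] ^ U[3]] + [U[n + 2] for n in range(2, x)]
    # w0 = u2 XOR u3, w1 = s0 XOR u1 XOR u2 XOR u3, w_n = u_(n+x) for n = 2 .. y-1
    w = [U[2] ^ U[3], s0 ^ U[1] ^ U[2] ^ U[3]] + [U[n + x] for n in range(2, y)]
    # Output state S'
    s_next = [s1 ^ s3 ^ U[1], s2 ^ U[2], s0, s1]
    return v, w, s_next
```

Note that the result always contains x + y = z + 1 data bits, one more than were supplied, in agreement with the relation stated above.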

An alternative naming scheme used hereafter is for input U to be identified as U(0), U′ as U(1), etc., output V to be identified as V(1), V′ as V(2), etc., output W to be identified as W(1), W′ as W(2), etc., input S to be identified as S(0), output or input S′ to be identified as S(1), output or input S″ to be identified as S(2), etc.

In older designs for transmission systems using trellis encoding (such as DSL modems), which are in general more hardware oriented, the trellis encoding of data, for subsequent modulation of tones for transmission, is typically performed by fixed-function logic circuits. However, such system designs are commonly hard to adapt for varying application requirements. In order to increase flexibility in modem development and application, it has become more common to use software to perform the various functions in a DMT-based transmitting device. As the various performance levels (such as data-rates) required of such devices increase, the pressure on the software to perform efficiently the individual processing tasks (such as trellis encoding), which make up the overall transmitter function, likewise increases.

One difficulty is that performing the trellis encoding operation purely in software is typically quite complex to implement. Using conventional instructions (e.g. bit-wise shift, bit-wise AND, bit-wise exclusive-OR, etc.) may take many cycles, or even tens of cycles, to perform trellis encoding for a single tone-pair. In some circumstances there may be hundreds or even thousands of tones for which the associated data bits must be encoded, per transmitted symbol, and several thousand symbols per second may need to be transmitted.

The trellis encoding process can therefore represent a significant proportion of the total computational cost for a software-based DMT transmitter, especially in the case of a system where one processor handles the operations for multiple independent transmission channels (e.g., in a multi-line DSL modem in the central office). With increasing workloads (in respect of the average number of tones used in each transmission channel), it becomes necessary to improve the efficiency of trellis encoding of data in such software-based DMT transmitters.

Therefore, what is needed is a system and method that significantly reduce the number of cycles needed for software to perform trellis encoding of data in accordance with a mapping scheme specified in international standards.

SUMMARY OF THE INVENTION

According to the present invention, these objects are achieved by a system and method as defined in the claims. The dependent claims define advantageous and preferred embodiments of the present invention.

The embodiments of the present invention provide a method, apparatus and processing instruction for trellis encoding data for subsequent modulation onto one or more tone-pairs. In general, the present invention comprises the steps of: (a) using a first input operand comprising input state bits; (b) using a second input operand comprising a plurality of input data bits; and (c) generating an output comprising trellis-encoded data bits and output state bits from a trellis encoding stage.

In one embodiment, the first input operand comprises a value of at least four bits (e.g. 16 bits, 32 bits or 64 bits) and the second input operand comprises a value of at least 30 bits (e.g. 32 bits, or 64 bits). Four bits of the first input operand may comprise the input state bits S(0) for a trellis stage. The second input operand comprises the input data bits U(0). The output comprises two outputs: a state output comprising the state bits S(1) from the trellis encoding stage and a data output comprising data bits V(1) and W(1). In this embodiment, the present invention performs the trellis encoding for one pair of tones.

In another embodiment, the first and second input operands each comprise a 64-bit value. Four bits of the 64 bits of the first input operand may comprise the input state bits S(0) for a first trellis stage. The second input operand comprises a first and second field of 32 bits each, and the first field comprises the input data bits U(0) for a first trellis stage, and the second field comprises the input data bits U(1) for a second trellis stage. The output comprises two 64-bit outputs: a state output comprising the state bits S(2) from a second trellis stage and a data output comprising data bits V(1) and W(1) from a first trellis stage, and data bits V(2) and W(2) from a second trellis stage. In this embodiment, the present invention performs the trellis encoding substantially simultaneously for two pairs of tones (i.e. four tones).

Further embodiments, features, and advantages of the present invention, as well as the structure and operation of the various embodiments of the present invention, are described in detail below with reference to the accompanying drawings.

BRIEF DESCRIPTION OF SEVERAL VIEWS OF THE DRAWINGS

The accompanying drawings, which are incorporated herein and form a part of the specification, illustrate the present invention and, together with the description, further serve to explain the principles of the invention and to enable a person skilled in the pertinent art to make and use the invention.

FIG. 1 illustrates a block diagram of a communications system in accordance with the present invention.

FIG. 2 illustrates a block diagram of a processor in accordance with one embodiment of the present invention.

FIG. 3A illustrates an instruction format for a three-operand instruction supported by the processor in accordance with one embodiment of the present invention.

FIG. 3B illustrates an instruction format for trellis encoding in accordance with one embodiment of the present invention.

DETAILED DESCRIPTION OF THE INVENTION

The present invention will now be described in detail with reference to a few preferred embodiments thereof as illustrated in the accompanying drawings. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, to one skilled in the art, that the present invention may be practiced without some or all of these specific details. In other instances, well known processes and steps have not been described in detail in order not to unnecessarily obscure the present invention.

Embodiments of the present invention provide an instruction or an instruction mechanism (“the instruction mechanism”) that significantly reduces the number of cycles needed to perform trellis encoding of data by a processor. In one embodiment, the trellis encoding of data is done in accordance with the mapping scheme specified in international standards for DSL. It is to be appreciated that the present invention can be used in other applications of DMT transmission where the same mapping scheme is used. A simple embodiment of the invention can implement the trellis encoding process of data for modulation onto one pair of tones. However, one skilled in the art will appreciate that the present invention is not restricted to this number of tones but may be used to trellis encode data to be modulated onto any number of tones or tone-pairs. For example, through the application of SIMD techniques and the combination of multiple instances of the basic trellis encoding equations (i.e. multiple stages of trellis encoding) described in more detail below, the instruction mechanism can directly implement the trellis encoding process substantially simultaneously for two or more encoding stages. For the case of encoding data for two pairs of tones, the trellis-encoding stages can be represented by:

Stage 1: (U(0), S(0))→(V(1), W(1), S(1)) (U(0) is z bits long, V(1) is x bits, W(1) is y bits)

Stage 2: (U(1), S(1))→(V(2), W(2), S(2)) (U(1) is z′ bits long, V(2) is x′ bits, W(2) is y′ bits)

As used herein, the notation S(0) represents the state input bits for a first trellis stage and S(N) represents the state output of the Nth stage for an N-tone-pair version. Thus, for example, S(1) represents the state output bits from a first trellis stage, and S(2) represents the state output bits from a second trellis stage. For the input data bits, U, the notation U(0) represents the data input bits for a first trellis stage and the notation U(1) represents the data input bits for a second trellis stage. The notation V(N) and W(N) represent the data output bits of the Nth stage for an N-tone-pair version. Thus, for example, V(1) and V(2) represent the data output bits from a first and second trellis stage respectively, and W(1) and W(2) represent the data output bits from a first and second trellis stage respectively.
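The chaining of state from S(0) through S(1) to S(2) can be illustrated with a short Python sketch. It applies the state equations from the background twice and compares the result with a flattened form in which S(2) is written directly in terms of S(0) and the data bits u₁, u₂ of each stage. The function names are assumptions made for the example, and the flattened form is derived here by substitution rather than quoted from the standards.

```python
from itertools import product


def next_state(s, u1, u2):
    """One stage of the state update: returns S' from S and the data bits u1, u2."""
    s0, s1, s2, s3 = s
    return (s1 ^ s3 ^ u1, s2 ^ u2, s0, s1)


def two_stage_state_chained(s, a1, a2, b1, b2):
    """S(0) -> S(1) -> S(2); a1, a2 are u1, u2 of U(0) and b1, b2 are u1, u2 of U(1)."""
    return next_state(next_state(s, a1, a2), b1, b2)


def two_stage_state_flat(s, a1, a2, b1, b2):
    """S(2) written directly in terms of S(0) and the four data bits."""
    s0, s1, s2, s3 = s
    return (s1 ^ s2 ^ a2 ^ b1, s0 ^ b2, s1 ^ s3 ^ a1, s2 ^ a2)


def forms_agree():
    """Exhaustively compare both forms over all 2**8 input combinations."""
    return all(
        two_stage_state_chained(bits[:4], *bits[4:])
        == two_stage_state_flat(bits[:4], *bits[4:])
        for bits in product((0, 1), repeat=8)
    )
```

Because S(2) depends only on S(0) and four data bits, both stages can in principle be evaluated by one block of combinational logic, without waiting for S(1) to be produced as a separate step.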

In general, the present invention provides a method, apparatus and processing instruction for substantially simultaneously trellis encoding data for subsequent modulation onto a plurality of tones by: (a) using a first input operand comprising input state bits for a first trellis stage; (b) using a second input operand comprising a plurality of input data bits; and (c) generating an output comprising (i) output data bits, and (ii) output state bits from a first or later trellis stage.

In one embodiment, the trellis encoding instruction mechanism takes as one input a 64-bit value comprising the input state bits S(0) for the first trellis stage, and as a second input a 64-bit value comprising two 32-bit fields wherein each field contains the U bits to be encoded for a respective trellis stage (i.e. a first field contains U(0) bits for the first trellis stage and the second field contains U(1) bits for the second trellis stage), and produces two outputs. The first output value is a 64-bit value comprising the four output state bits S(2) from the second trellis stage, along with 60 other bits which are unused. The second output value is also a 64-bit value, comprising the V(1) and W(1) outputs from the first trellis stage and the V(2) and W(2) outputs from the second trellis stage.

While specific configurations and arrangements are discussed, it should be understood that this is done for illustrative purposes only. A person skilled in the pertinent art will recognize that other configurations and arrangements can be used without departing from the spirit and scope of the present invention. It will be apparent to a person skilled in the pertinent art that this invention can also be employed in a variety of other applications.

Embodiments of the invention are discussed below with reference to FIGS. 1 to 3. However, those skilled in the art will readily appreciate that the detailed description given herein with respect to these figures is for explanatory purposes, as the invention extends beyond these limited embodiments.

Referring now to FIG. 1, there is shown a block diagram of a communications system 100 in accordance with one embodiment of the present invention. System 100 provides traditional voice telephone service (plain old telephone service, or POTS) along with high speed Internet access between a customer premise 102 and a central office 104 via a subscriber line 106. At the customer premise end 102, various customer premise devices may be coupled to the subscriber line 106, such as telephones 110 a, 110 b, a fax machine 112, a DSL CPE (Customer Premise Equipment) modem 114 and the like. A personal computer 116 may be connected via DSL CPE modem 114. At the central office end 104, various central office equipment may be coupled to the subscriber line 106, such as a DSL CO (Central Office) modem 120 and a POTS switch 122. Modem 120 may be further coupled to a router or ISP 124 which allows access to the Internet 126. POTS switch 122 may be further coupled to a PSTN 128.

In accordance with one embodiment of the present invention, system 100 provides for data to be sent in each direction as a data stream between the central office 104 and the customer premise 102 via subscriber line 106. As data is sent from the central office 104 to the customer premise 102, the DSL CO modem 120 at the central office 104 can trellis encode the data in accordance with the principles of the present invention before modulating and transmitting the data via subscriber line 106. Similarly, when data is sent from the customer premise 102 to the central office 104, the DSL CPE modem 114 at the customer premise 102 can trellis encode the data in accordance with the principles of the present invention before modulating and transmitting the data via subscriber line 106. In a preferred embodiment, DSL CO modem 120 incorporates a BCM6411 or BCM6510 device, produced by Broadcom Corporation of Irvine, Calif., to implement its various functions.

Referring now to FIG. 2, there is shown a schematic block diagram of the core of a modem processor 200 in accordance with one embodiment of the present invention. In a preferred embodiment, processor 200 is the Broadcom FirePath processor used in the BCM6411 and BCM6510 devices. The processor 200 is a 64-bit long instruction word (LIW) machine consisting of two execution units 206 a, 206 b. Each unit 206 a, 206 b is capable of 64-bit execution on multiple data units (for example, four 16-bit data units at once), each controlled by half of the 64-bit instruction. The execution units 206 a, 206 b may include single instruction, multiple data (SIMD) units.

SIMD stands for “Single Instruction Multiple Data” and describes a style of digital processor design in which a single instruction can be issued to control the processing of multiple data values in parallel (all being processed in the same manner). SIMD operations can be implemented in a digital processor, such as Broadcom's FirePath digital processor design, by data processing units which receive multiple input values, each 64 bits wide but capable of being logically subdivided into and treated as multiple smaller values, e.g. 8×8-bit values, 4×16-bit values, or 2×32-bit values.

To illustrate SIMD working as used in FirePath, consider the FirePath instruction:

ADDH c, a, b

The instruction mnemonic ADDH is an abbreviation for “Add Half-words.” The instruction “ADDH c, a, b” takes as input two 64-bit operands from registers a and b, and writes its result back to register c. ADDH performs four 16-bit (“half-word”) additions: each 16-bit value in a is added to the corresponding 16-bit value within b to produce 4×16-bit results in the 64-bit output value c. Thus, this SIMD method allows for a great increase in computational power compared with earlier types of processors where an instruction can only operate on a single set of input data values (e.g. one 16-bit operand from a, one 16-bit operand from b giving one 16-bit result in c). For situations where the same operation is to be performed repeatedly across an array of values, which is common in digital signal processing applications, it allows in this instance an increase in speed by a factor of four of the basic processing rate, since four add operations can be performed at once rather than only one.
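The lane-wise behaviour described for ADDH can be modelled in a few lines of Python. This is a sketch for illustration rather than a description of the FirePath implementation: the function name `addh` is hypothetical, and the overflow behaviour (each 16-bit lane wraps around independently, with no carry into the neighbouring lane) is an assumption, since the text above does not say how overflow is handled.

```python
def addh(a, b):
    """Add four 16-bit half-words packed into 64-bit integers, lane by lane."""
    result = 0
    for lane in range(4):
        shift = 16 * lane
        x = (a >> shift) & 0xFFFF
        y = (b >> shift) & 0xFFFF
        # Assumed behaviour: the sum wraps modulo 2**16 and never
        # carries into the next lane.
        result |= ((x + y) & 0xFFFF) << shift
    return result
```

For example, `addh(0x0001000200030004, 0x0010002000300040)` yields `0x0011002200330044`: four independent additions from one call, which is the source of the factor-of-four speed-up described above.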

Processor 200 also includes an instruction cache 202 to hold instructions for rapid access, and an instruction decoder 204 for decoding the instruction received from the instruction cache 202. Processor 200 further includes a set of MAC Registers 218 a, 218 b, that are used to improve the efficiency of multiply-and-accumulate (MAC) operations common in digital signal processing, sixty-four (or more) general purpose registers 220 which are preferably 64 bits wide and shared by execution units 206 a, 206 b, and a dual ported data cache or RAM 222 that holds data needed in the processing performed by the processor. Execution units 206 a, 206 b further comprise multiplier accumulator units 208 a, 208 b, integer units 210 a, 210 b, trellis encoding units 212 a, 212 b, Galois Field units 214 a, 214 b, and load/store units 216 a, 216 b.

Multiplier accumulator units 208 a, 208 b perform the process of multiplication and addition of products (MAC) commonly used in many digital signal processing algorithms such as may be used in a DSL modem.

Integer units 210 a, 210 b perform many common operations on integer values used in general computation and signal processing.

Galois Field units 214 a, 214 b perform special operations using Galois field arithmetic, such as may be executed in the implementation of the well-known Reed-Solomon error protection coding scheme.

Load/store units 216 a, 216 b perform accesses to the data cache or RAM, either to load data values from it into general purpose registers 220 or store values to it from general purpose registers 220. They also provide access to data for transfer to and from peripheral interfaces outside the core of processor 200, such as an external data interface for ATM cell data.

Trellis encoding units 212 a, 212 b directly implement the trellis encoding process for the processor 200. These units may be instantiated separately within the processor 200 or may be integrated within another unit such as the integer unit 210. In one embodiment, each trellis encoding unit 212 a, 212 b receives a first input operand comprising the input state bits S(0) for a first trellis stage, a second input operand comprising the input data U bits (i.e. input data bits U(0) for a first trellis stage and input data bits U(1) for a second trellis stage), and generates an output comprising output state bits S(2) and data output bits V(1), W(1), V(2), W(2).

Referring now to FIG. 3A, there is shown an example of an instruction format for a three-operand instruction supported by the processor 200. In one embodiment, the instruction format includes 14 bits of opcode and control information, and three six-bit operand specifiers. As will be appreciated by one skilled in the art, exact details such as the size of the instruction in bits, and how the various parts of the instruction are laid out and ordered within the instruction format, are not themselves critical to the principles of the present invention: the parts could be in any order as might be convenient for the implementation of the instruction decoder 204 of the processor 200 (including the possibility that any part of the instruction such as the opcode and control information may not be in a single continuous sequence of bits such as is shown in FIG. 3A). The operand specifiers are references to registers in the set of general purpose registers 220 of processor 200. The first of the operands is a reference to a destination register for storing the results of the instruction. The second operand is a reference to a first source register for the instruction, and the third operand is a reference to a second source register for the instruction.
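The field widths described for FIG. 3A (14 bits of opcode and control information plus three six-bit operand specifiers, 32 bits in all) can be illustrated with a small Python sketch. The bit positions chosen here (opcode in the top 14 bits, followed by destination, first source and second source) are purely hypothetical, as are the function names; as noted above, the actual ordering is not critical and may differ.

```python
def encode_instruction(opcode, dest, src1, src2):
    """Pack a 14-bit opcode/control field and three 6-bit register specifiers."""
    if not 0 <= opcode < (1 << 14):
        raise ValueError("opcode/control field must fit in 14 bits")
    for reg in (dest, src1, src2):
        if not 0 <= reg < (1 << 6):
            raise ValueError("register specifier must fit in 6 bits")
    # Hypothetical layout: [31..18] opcode, [17..12] dest, [11..6] src1, [5..0] src2
    return (opcode << 18) | (dest << 12) | (src1 << 6) | src2


def decode_instruction(word):
    """Recover (opcode, dest, src1, src2) from a word built by encode_instruction."""
    return (
        (word >> 18) & 0x3FFF,
        (word >> 12) & 0x3F,
        (word >> 6) & 0x3F,
        word & 0x3F,
    )
```

A six-bit specifier distinguishes 64 registers, which matches the sixty-four general purpose registers 220 mentioned earlier.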

Referring now to FIG. 3B, there is shown an example of a possible instruction format for an instruction to perform trellis encoding in accordance with mapping schemes specified in international or national DSL standards supported by processor 200 in accordance with the present invention. The mnemonic for the opcode is shown as “DSLTE”, where DSLTE stands for DSL Trellis Encode. The actual mnemonic used is incidental; for example in another embodiment, an alternative mnemonic for the same instruction might be “ADSLTE”, since the trellis encoding scheme discussed above was first specified for ADSL modems. Again it should be observed that exact details of how this instruction format is implemented (the size, order and layout of the various parts of the instruction, exact codes used to represent the DSLTE opcode, etc.) are not critical to the principles of the present invention. The DSLTE instruction uses the three-operand instruction format shown in FIG. 3A, and in one embodiment, is defined to take three six-bit operand specifiers. The first of the operands is a reference to a pair of 64-bit destination registers for an output “stateout/dataout” where the results of the DSLTE instruction are stored. The second operand is a reference to a first source register for a first input “statein” from which state input bits are read, and the third operand is a reference to a source register for the second input “datain” from which input data bits are read. One skilled in the art will realize that the present invention is not limited to any specific register or location for those registers but that the instruction of the present invention may refer to an arbitrary register in the general purpose registers 220.

Thus, by means of this generality of specification, the present invention advantageously achieves great flexibility in the use of the invention. For example, the present invention enables the original data, which is to be trellis encoded, to be obtained from any location chosen by the implementor (e.g. by first loading that data from the memory 222 into any convenient register, or it may already be in a register as a result of a previous processing operation). Likewise, the resulting trellis encoded data may be placed anywhere convenient for further processing such as in some general purpose register 220 for immediate further operations, or the resulting trellis encoded data may be placed back in memory 222 for later use. Thus, the flexibility of the present invention is in sharp contrast to conventional (hardware) implementations of the trellis encoding function, where the data flow is fixed in an arrangement dictated by the physical movement of data through the hardware, and cannot be adapted or modified to suit different modes of use.

Similarly, the arrangement and use of separate ‘state’ data values is completely unconstrained, but may be arranged according to preference and passed in and out for each invocation of the instruction. This again contrasts with conventional (hardware) implementations of the trellis encoding function, in which the ‘state’ (successive values of S) is typically held internally within the trellis encoding hardware, rather than being passed in as and when trellis encoding is required. This means that re-using a hardware implementation to trellis encode multiple distinct data streams at the same time is either impossible, or certainly more complex to implement, since some arrangement must be made to allow the individual states for the different streams to be swapped in and out.
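The benefit of passing state in and out explicitly can be sketched in Python. The example below keeps one state value per data stream and interleaves the streams round-robin; because no state is hidden inside the encoder, each stream ends in the same state as if it had been encoded alone. Only the state update from the background equations is modelled, and the function names, the dictionary of streams, and the all-zero initial state are assumptions made for the example.

```python
def next_state(state, u1, u2):
    """State update for one trellis stage, from the equations in the background."""
    s0, s1, s2, s3 = state
    return (s1 ^ s3 ^ u1, s2 ^ u2, s0, s1)


def run_alone(pairs, initial=(0, 0, 0, 0)):
    """Encode one stream on its own; pairs is a list of (u1, u2) per tone-pair."""
    state = initial
    for u1, u2 in pairs:
        state = next_state(state, u1, u2)
    return state


def run_interleaved(streams, initial=(0, 0, 0, 0)):
    """Encode several streams round-robin, holding one state value per stream."""
    states = {name: initial for name in streams}
    longest = max((len(pairs) for pairs in streams.values()), default=0)
    for i in range(longest):
        for name, pairs in streams.items():
            if i < len(pairs):
                u1, u2 = pairs[i]
                # The caller owns the state, so switching streams costs nothing extra.
                states[name] = next_state(states[name], u1, u2)
    return states
```

In a register-based instruction the same effect is obtained simply by naming a different state register for each stream.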

In one embodiment, the trellis encoding instruction is used in the software on a processor chip or chip-set implementing a central-office modem end of a DSL link (e.g. ADSL or VDSL). However, one skilled in the art will realize that the present invention is not limited to this implementation, but may be equally used in other contexts where data must be trellis encoded in a substantially similar way, such as in a DSL CPE modem at the customer premise, or in systems not implementing DSL.

In one embodiment, the DSLTE instruction takes as one input a 64-bit value comprising the input state bits S(0) for the first trellis stage. In one embodiment of the first input, only the least significant four bits are used to represent the input state bits. However, one skilled in the art will realize that the principles of the present invention are not linked to this arrangement but that the input state bits may be organized in other ways. The second input operand is also 64 bits in size and comprises the U bits to be encoded. In one embodiment, the second input operand comprises two word fields, where a word is a 32-bit quantity. One word (e.g. the lower (least-significant) word) may contain the U bits for a first trellis stage (U(0)), and the other word (e.g. the upper (most-significant) word) may contain the U bits for a second trellis stage (U(1)). The U bits in each field may be between 3 and 31 bits in length. In another embodiment, simplification of the implementation of this instruction mechanism can be achieved through the use of U bits that are not in a contiguous subset of bits within each respective word field, but instead are each partitioned into two contiguous subsets which are presented aligned at the least-significant (right-hand) end of each of the two 16-bit (“half-word”) fields which make up the word field. For example, the lower half-word of each word field can contain bits {u₁, u₂, . . . , u_(x+1)} of the respective U bits (U(0) or U(1)) and the upper half-word can contain bits {u_(x+2), u_(x+3), . . . , u_(z)} of the respective U bits. By splitting each of the U(0) and U(1) inputs in this way, the instruction mechanism does not need to take account of the values of x, y, x′, y′ (the lengths of the respective sections of U(0) and U(1)). In this embodiment, the U bits in each word field may be between 3 and 30 bits in total, with up to 16 U bits in the lower half-word and up to 14 U bits in the upper half-word. As with the arrangement of data in the first input operand, one skilled in the art will realize that the arrangement of the U bits is not limited to this description, but may be organized in other ways as well.
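The split half-word arrangement can be illustrated with a Python sketch that builds the 64-bit second input operand from two lists of U bits. The function names are hypothetical. The placement of u₁ in the least-significant bit of the lower half-word is an assumption made for the example; it is inferred from the phrase "aligned at the least-significant end" rather than stated bit by bit in the text.

```python
def pack_u_word(u, x):
    """Pack U bits [u1, ..., uz] into one 32-bit word field.

    The lower half-word receives u1 .. u_(x+1); the upper half-word
    receives u_(x+2) .. u_z. Each group is aligned to bit 0 of its half-word.
    """
    lower_bits = u[: x + 1]
    upper_bits = u[x + 1 :]
    if len(lower_bits) > 16 or len(upper_bits) > 14:
        raise ValueError("at most 16 lower and 14 upper U bits are allowed")
    lower = sum(bit << i for i, bit in enumerate(lower_bits))
    upper = sum(bit << i for i, bit in enumerate(upper_bits))
    return lower | (upper << 16)


def pack_datain(u_stage1, x1, u_stage2, x2):
    """Build the 64-bit operand: U(0) in the lower word, U(1) in the upper word."""
    return pack_u_word(u_stage1, x1) | (pack_u_word(u_stage2, x2) << 32)
```

The software that prepares the operand must know x for each tone in order to split the bits, but the encoding logic itself can then treat every half-word identically, which is the simplification described above.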

The output of the instruction comprises two outputs: a first output value comprising the output state bits S(2) from the second trellis stage, and a second output value containing V(1), W(1), V(2) and W(2). In one embodiment, the first output value comprises 64 bits, of which only the bottom four bits contain the output state bits. In an embodiment, the second output value comprises 64 bits, organized as four half-words (16-bit quantities), containing V(1), W(1), V(2), W(2) respectively with each field aligned to the bottom (least-significant end) of its respective half-word. Again, as with the first and second input operands, one skilled in the art will realize that the outputs of the present invention are not limited to the arrangement described above, but may be organized in other ways as well.
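A Python sketch of how software might take the two output values apart is given below, for illustration only. The function names and the dictionary keys are hypothetical; the half-word order (V(1) lowest, then W(1), V(2) and W(2)) follows the arrangement described above.

```python
def unpack_dataout(dataout):
    """Split the 64-bit data output into its four 16-bit half-word fields."""
    names = ("V1", "W1", "V2", "W2")
    return {name: (dataout >> (16 * i)) & 0xFFFF for i, name in enumerate(names)}


def unpack_stateout(stateout):
    """Return the four output state bits held in the bottom four bits, bit 0 first."""
    return [(stateout >> i) & 1 for i in range(4)]


def field_bits(value, width):
    """List the low `width` bits of a field, index 0 being the least significant."""
    return [(value >> i) & 1 for i in range(width)]
```

Each half-word can then be passed to the QAM mapping for its tone, using the x or y bit width that applies to that tone.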

In operation, the instruction mechanism is implemented in a processor, such that the instruction mechanism performs a multi-stage (such as 2-stage) trellis encoding process for data to be modulated onto a plurality of tones (such as 4 tones) in a single operation whose execution is initiated and can also be completed during one cycle. In contrast, conventionally a processor required the execution of at least 10 operations, over multiple cycles, in order to trellis-encode 4 tones. Therefore, the instruction mechanism of the present invention significantly increases the efficiency of trellis encoding of data for subsequent modulation and transmission.

The core operation performed by the DSLTE instruction mechanism for 64-bit first and second input operands as discussed above is described by the following abstract logic description:

stateout.0 = statein.1 ⊕ statein.2 ⊕ datain.1 ⊕ datain.32
stateout.1 = statein.0 ⊕ datain.33
stateout.2 = statein.1 ⊕ statein.3 ⊕ datain.0
stateout.3 = statein.2 ⊕ datain.1
stateout.<63..4> = ZEROS(60)
dataout.0 = datain.2
dataout.1 = datain.0 ⊕ datain.2
dataout.<14..2> = datain.<15..3>
dataout.15 = 0
dataout.16 = datain.1 ⊕ datain.2
dataout.17 = statein.0 ⊕ datain.0 ⊕ datain.1 ⊕ datain.2
dataout.<31..18> = datain.<29..16>
dataout.32 = datain.34
dataout.33 = datain.32 ⊕ datain.34
dataout.<46..34> = datain.<47..35>
dataout.47 = 0
dataout.48 = datain.33 ⊕ datain.34
dataout.49 = statein.1 ⊕ statein.3 ⊕ datain.0 ⊕ datain.32 ⊕ datain.33 ⊕ datain.34
dataout.<63..50> = datain.<61..48>

In the above abstract logic description:

the inputs are statein and datain, in which statein.0 holds S(0)₀, statein.1 holds S(0)₁, statein.2 holds S(0)₂, statein.3 holds S(0)₃, datain.<31..0> holds U(0) and datain.<63..32> holds U(1);

the outputs are stateout and dataout, in which stateout.0 receives S(2)₀, stateout.1 receives S(2)₁, stateout.2 receives S(2)₂, stateout.3 receives S(2)₃, dataout.<15..0> receives V(1), dataout.<31..16> receives W(1), dataout.<47..32> receives V(2) and dataout.<63..48> receives W(2).

In the above description the following definitions apply:

- val.n (where val is an identifier for a linear bit sequence of one or more bits, such as statein, dataout, etc., and n is a constant such as 5) means bit n of value val; bit 0 is the least significant bit, and bit 1 is the next more significant bit, etc.
- ZEROS(s) means the linear bit sequence of length s in which all bits are 0.
- val.<m..n> (where val is an identifier for a linear bit sequence and m and n are constants or constant expressions and m≥n) means the linear bit sequence SEQ(val.m, val.(m−1), . . . val.n).
- SEQ(a, b, . . . z) means the linear bit sequence resulting from the concatenation of the listed bit values a, b, . . . z, where bit a becomes the most significant bit, b the next most significant bit, etc., and z the least significant bit of the resulting sequence. The length of the sequence is equal to the number of bit values in the list.
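For illustration, the abstract logic description can be transcribed into a software model in Python. This is a sketch intended for checking the bit equations against test values, not a description of how any hardware embodiment is built; the function names `bit`, `field` and `dslte` are hypothetical, and Python integers stand in for the 64-bit registers.

```python
def bit(val, n):
    """val.n: bit n of val, with bit 0 the least significant."""
    return (val >> n) & 1


def field(val, m, n):
    """val.<m..n> as an integer, with bit n in the least-significant position."""
    return (val >> n) & ((1 << (m - n + 1)) - 1)


def dslte(statein, datain):
    """Return (stateout, dataout) for the two-stage encoding described above."""
    s = [bit(statein, i) for i in range(4)]

    def d(n):
        return bit(datain, n)

    # stateout.<63..4> is left as zeros.
    stateout = (
        (s[1] ^ s[2] ^ d(1) ^ d(32))
        | (s[0] ^ d(33)) << 1
        | (s[1] ^ s[3] ^ d(0)) << 2
        | (s[2] ^ d(1)) << 3
    )
    dataout = (
        d(2)
        | (d(0) ^ d(2)) << 1
        | field(datain, 15, 3) << 2      # dataout.<14..2>; dataout.15 stays 0
        | (d(1) ^ d(2)) << 16
        | (s[0] ^ d(0) ^ d(1) ^ d(2)) << 17
        | field(datain, 29, 16) << 18    # dataout.<31..18>
        | d(34) << 32
        | (d(32) ^ d(34)) << 33
        | field(datain, 47, 35) << 34    # dataout.<46..34>; dataout.47 stays 0
        | (d(33) ^ d(34)) << 48
        | (s[1] ^ s[3] ^ d(0) ^ d(32) ^ d(33) ^ d(34)) << 49
        | field(datain, 61, 48) << 50    # dataout.<63..50>
    )
    return stateout, dataout
```

As a usage example, `dslte(0b1001, 0x0003000600000005)` returns `(0, 0x000C000300030001)`. Working the single-stage equations from the background by hand for the same inputs (x=2, y=2 for the first tone-pair and x=3, y=4 for the second) gives the same V(1), W(1), V(2), W(2) and S(2).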

The above abstract logic description is only one of many possible ways to define logic circuitry to achieve the desired function. The logical combination of the various input bits to produce the output bits can be defined in other ways, for example by sharing the calculation of common sub-expressions of the above logic equations such as “statein.1 ⊕ statein.3 ⊕ datain.0”, which appears both as the equation for stateout.2 and as part of the equation for dataout.49. Therefore the above abstract logic description is given by way of example only, and other descriptions can be used as well. One way in which the current invention may be implemented in the context of a semiconductor chip is by use of logic synthesis tools (such as the software program ‘BuildGates’ by Cadence Design Systems, Inc.) to create a logic circuit implementing the core function of the DSLTE instruction as defined above. Such tools take as input a high-level definition in a formal definition language such as Verilog or VHDL; such languages have a general character comparable to the above abstract logic description, though differing in detail. A skilled artisan can readily use the above abstract logic description to create such a high-level definition and thereby create a logic circuit using such tools.
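The sharing of a common sub-expression mentioned above can be sketched, by way of example only, in Python. The function partial_dslte and its variable names are hypothetical, and the sketch covers only three of the equations of the abstract logic description (stateout.2, dataout.49 and dataout.<63..50>), not the complete DSLTE function. It assumes that the individual bits in those equations are combined by exclusive-OR.

```python
def bit(val: int, n: int) -> int:
    """Return bit n of val, where bit 0 is the least significant bit."""
    return (val >> n) & 1


def partial_dslte(statein: int, datain: int):
    """Compute stateout.2, dataout.49 and dataout.<63..50> only.

    Assumption: the bits are combined with exclusive-OR (Python's ^).
    The shared term is evaluated once and reused in both equations.
    """
    # Common sub-expression: statein.1 XOR statein.3 XOR datain.0
    shared = bit(statein, 1) ^ bit(statein, 3) ^ bit(datain, 0)

    stateout_2 = shared
    dataout_49 = (shared
                  ^ bit(datain, 32)
                  ^ bit(datain, 33)
                  ^ bit(datain, 34))

    # dataout.<63..50> = datain.<61..48>: a 14-bit field copied unchanged
    dataout_63_50 = (datain >> 48) & 0x3FFF

    return stateout_2, dataout_49, dataout_63_50
```

In a hardware description the same sharing would correspond to a single intermediate signal feeding both outputs; a logic synthesis tool may also discover such sharing on its own.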

While the invention has been described with reference to certain embodiments, it will be understood by those skilled in the art that various changes may be made and equivalents may be substituted without departing from the scope of the invention. In addition, many modifications may be made to adapt a particular situation or material to the teachings of the invention without departing from its scope. Therefore, it is intended that the invention not be limited to the particular embodiment disclosed, but that the invention will include all embodiments falling within the scope of the appended claims.

1. A method for trellis encoding data for subsequent modulation onto one or more pairs of tones, the method comprising: receiving a first operand including state input bits for a first trellis stage; receiving a second operand including data input bits for a plurality of trellis stages, the plurality of trellis stages including the first trellis stage and a final trellis stage; and generating a trellis encoded output for subsequent modulation onto the one or more pairs of tones based on the first operand, the second operand, and data output bits, wherein the trellis encoded output comprises state output bits from the final trellis stage, wherein the data output bits include a plurality of sets of data output bits from the plurality of trellis stages, and wherein the plurality of sets of data output bits from the plurality of trellis stages are generated substantially simultaneously.
2. The method of claim 1, wherein the step of generating the trellis encoded output for subsequent modulation onto the one or more pairs of tones comprises: modulating the trellis encoded output for subsequent modulation onto two pairs of tones.
3. The method of claim 2, wherein the step of generating the trellis encoded output comprises: generating a first 64-bit value for a first set of data output bits and a second 64-bit value for a second set of data output bits.
4. The method of claim 2, wherein the step of generating the trellis encoded output comprises: generating state bits S(2) for the state output bits.
5. The method of claim 2, wherein the step of generating the trellis encoded output comprises: generating data output bits V(1), W(1) for a first set of data output bits and data output bits V(2), W(2) for a second set of data output bits.
6. The method of claim 2, wherein the step of receiving the first operand comprises: receiving a 64-bit value for the first operand.
7. The method of claim 6, wherein the step of receiving the 64-bit value for the first operand comprises: receiving the 64-bit value including a 4-bit value for the state input bits for the first operand.
8. The method of claim 2, wherein the step of receiving the second operand comprises: receiving a 64-bit value for the second operand.
9. The method of claim 8, wherein the step of receiving the 64-bit value for the second operand comprises: receiving a first field of 32-bits and a second field of 32-bits for the second operand.
10. The method of claim 9, wherein the step of receiving the first field of 32-bits and the second field of 32-bits for the second operand comprises: receiving U(0) bits for the first field and U(1) bits for the second field.
11. An instruction mechanism to trellis encode data for subsequent modulation onto one or more pairs of tones, comprising: an instruction decoder configured to receive a single instruction; and an execution unit configured to trellis encode data for subsequent modulation onto the one or more pairs of tones in response to the single instruction, the trellis encoding including: receiving a first operand including state input bits for a first trellis stage, receiving a second operand including data input bits for a plurality of trellis stages, the plurality of trellis stages including the first trellis stage and a final trellis stage, and generating a trellis encoded output for subsequent modulation onto the one or more pairs of tones based on the first operand, the second operand, and data output bits, wherein the trellis encoded output includes state output bits from the final trellis stage, wherein the data output bits include a plurality of sets of data output bits from the plurality of trellis stages, and wherein the plurality of sets of data output bits from the plurality of trellis stages are generated substantially simultaneously.
12. The instruction mechanism of claim 11, wherein the execution unit is configured to generate the trellis encoded output for subsequent modulation onto two pairs of tones.
13. The instruction mechanism of claim 12, wherein a first set of data output bits comprises a first 64-bit value and a second set of data output bits comprises a second 64-bit value.
14. The instruction mechanism of claim 12, wherein the state output bits comprises state bits S(2).
15. The instruction mechanism of claim 12, wherein a first set of data output bits comprises data output bits V(1), W(1) and a second set of data output bits comprises data output bits V(2), W(2).
16. The instruction mechanism of claim 12, wherein the execution unit comprises: an encoding unit configured to trellis encode the data, wherein the trellis encoding unit is configured to generate the trellis encoded output for subsequent modulation onto the one or more pairs of tones.
17. The instruction mechanism of claim 12, wherein the first operand comprises a 64-bit value.
18. The instruction mechanism of claim 17, wherein the 64-bit value comprises a 4-bit value for the state input bits for the first operand.
19. The instruction mechanism of claim 12, wherein the second operand comprises a 64-bit value.
20. The instruction mechanism of claim 19, wherein the 64-bit value comprises a first field of 32-bits and a second field of 32-bits for the second operand.
21. The instruction mechanism of claim 20, wherein the first field of 32-bits comprises U(0) bits and the second field of 32-bits comprises U(1) bits.